Recognition of speech produced in noise.

Authors

A L Pittman

T L Wiley

Abstract

A two-part study examined recognition of speech produced in quiet and in noise by normal-hearing adults. In Part I, five women produced 50 sentences, each consisting of an ambiguous carrier phrase followed by a unique target word. These sentences were spoken in three environments: quiet, wide band noise (WBN), and meaningful multi-talker babble (MMB). The WBN and MMB competitors were presented through insert earphones at 80 dB SPL. For each talker, the mean vocal level, long-term average speech spectra, and mean word duration were calculated for the 50 target words produced in each speaking environment. Compared to quiet, the vocal levels produced in WBN and MMB increased an average of 14.5 dB. The increase in vocal level was characterized by increased spectral energy in the high frequencies. Word duration also increased an average of 77 ms in WBN and MMB relative to the quiet condition. In Part II, the sentences produced by one of the five talkers were presented to 30 adults in the presence of multi-talker babble under two conditions, and recognition was evaluated for each. In the first condition, the sentences produced in quiet and in noise were presented at equal signal-to-noise ratios (SNR(E)), which removed the vocal level differences between the speech samples. In the second condition, the vocal level differences were preserved (SNR(P)). For the SNR(E) condition, recognition of the speech produced in WBN and MMB was on average 15% higher than that for the speech produced in quiet. For the SNR(P) condition, recognition increased an average of 69% for these same speech samples relative to speech produced in quiet. In general, correlational analyses failed to show a direct relation between the acoustic properties measured in Part I and the recognition measures in Part II.
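The difference between the two presentation conditions can be illustrated with a small level-scaling sketch. This is not the authors' procedure: the abstract does not give the presentation SNR, the reference used for the preserved condition, or the stimuli, so the 0 dB target, the choice of the quiet sample as reference, the synthetic tones, and all function names below are assumptions made only for illustration.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def snr_db(speech, noise):
    """Signal-to-noise ratio in dB, computed from RMS amplitudes."""
    return 20.0 * math.log10(rms(speech) / rms(noise))

def gain_for_snr(speech, noise, target_snr_db):
    """Linear gain that places `speech` at `target_snr_db` relative to `noise`."""
    return 10.0 ** ((target_snr_db - snr_db(speech, noise)) / 20.0)

def scale(samples, gain):
    """Apply a linear gain to every sample."""
    return [gain * x for x in samples]

def equalized(samples_by_env, noise, target_snr_db):
    """SNR(E)-style: each sample gets its own gain, so level differences vanish."""
    return {env: scale(s, gain_for_snr(s, noise, target_snr_db))
            for env, s in samples_by_env.items()}

def preserved(samples_by_env, noise, target_snr_db, reference="quiet"):
    """SNR(P)-style: one shared gain (assumed to be set from the reference
    sample), so level differences between samples are kept."""
    shared = gain_for_snr(samples_by_env[reference], noise, target_snr_db)
    return {env: scale(s, shared) for env, s in samples_by_env.items()}

def tone(amplitude, n=1000, cycles=10):
    """Synthetic stand-in for a recording: a sine with a whole number of cycles."""
    return [amplitude * math.sin(2.0 * math.pi * cycles * i / n) for i in range(n)]

# Hypothetical stimuli: the "wbn" sample is 14.5 dB above the "quiet" sample,
# matching the average vocal level increase reported in Part I.
samples = {"quiet": tone(0.1), "wbn": tone(0.1 * 10.0 ** (14.5 / 20.0))}
babble = tone(0.2, cycles=7)  # stand-in for the multi-talker babble

for name, condition in (("SNR(E)", equalized(samples, babble, 0.0)),
                        ("SNR(P)", preserved(samples, babble, 0.0))):
    for env, scaled in condition.items():
        print(name, env, round(snr_db(scaled, babble), 2))
```

Under these assumptions, the equalized condition puts every sample at the same SNR, while the preserved condition leaves the sample produced in noise above the quiet sample by its original level difference.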

Similar resources

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

In recent years, automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

A New Method for Robust Speech Recognition Based on Missing Data Using a Bidirectional Neural Network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the family of missing-feature methods. In these methods, the components of the time-frequency representation of the signal (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...

Envelope-based inter-aural time difference localization training to improve speech-in-noise perception in the elderly

Background: Many elderly individuals complain of difficulty in understanding speech in noise despite having normal hearing thresholds. According to previous studies, auditory training leads to improvement in speech-in-noise perception, but these studies did not consider the etiology, so their results cannot be generalized. The present study aimed at investigating the effectiveness of envelope-b...

Improving the performance of MFCC for Persian robust speech recognition

Mel frequency cepstral coefficients (MFCC) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...

Effects of ageing on speed and temporal resolution of speech stimuli in older adults

Background: According to previous studies, most speech recognition disorders in older adults result from deficits in audibility and auditory temporal resolution. In this paper, the effect of ageing on time-compressed speech and auditory temporal resolution was studied using word recognition in continuous and interrupted noise. Methods: A time-compressed speech test (TCST) w...

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

The quality of a speech signal degrades significantly in the presence of environmental noise, which impairs the performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...

Journal:

Journal of speech, language, and hearing research : JSLHR

Volume 44, Issue 3

Pages: -

Publication date: 2001
